Human epididymis protein 4, a novel potential biomarker for diagnostic and prognosis monitoring of lung cancer

Abstract Objective This study aimed to explore the application value of human epididymis protein 4 (HE4) in diagnosing and monitoring the prognosis of lung cancer. Methods First, The Cancer Genome Atlas (TCGA) database was used to analyze whey-acidic-protein 4-disulfide bond core domain 2 (WFDC2) gene expression levels in lung cancer tissues. Then, a total of 160 individuals were enrolled and categorized into three groups: the lung cancer group (n = 80), the benign lesions group (n = 40), and the healthy controls group (n = 40). Serum HE4 levels and other biomarkers were quantified using an electro-chemiluminescent immunoassay. Additionally, the expression of HE4 in tissues was analyzed through immunohistochemistry (IHC). In vitro cultures of human bronchial epithelial (HBE) cells and various lung cancer cell lines (SPC/PC9/A549/H520) were used to detect HE4 levels via western blot (WB). Results Analysis of the TCGA and UALCAN (The University of Alabama at Birmingham Cancer Data Analysis Portal) databases showed that WFDC2 gene expression levels were upregulated in lung cancer tissues (p < 0.01). Compared with the control group and the benign group, HE4 was significantly higher in the serum of patients with lung cancer (p < 0.001). Receiver operating characteristic (ROC) analysis confirmed that HE4 had better diagnostic efficacy than classical markers in the differential diagnosis of lung cancer and benign lesions and had the highest diagnostic value in lung adenocarcinoma (area under the ROC curve [AUC] = 0.826). HE4 increased in early lung cancer and positively correlated with poor prognosis (p < 0.001). Moreover, the results of WB and IHC revealed that the expression of HE4 was increased in lung cancer cells (SPC/A549/H520) and lung cancer tissues but decreased in PC9 cells, which carry an EGFR exon 19 deletion (p < 0.05). Conclusion Serum HE4 emerges as a promising novel biomarker for the diagnosis and prognosis assessment of lung cancer.

KEYWORDS diagnosis, human epididymis protein 4, lung adenocarcinoma, lung cancer, prognosis

| INTRODUCTION
According to the Global Cancer Statistics 2020, lung cancer is the second most commonly diagnosed cancer and remains the leading cause of cancer death, with an estimated 2.2 million new cases and 1.8 million deaths. 1 Lung cancer poses a substantial threat to human health, primarily due to its association with metastasis and recurrence. The early stages of lung cancer often present with subtle clinical symptoms, leading to misdiagnosis as benign conditions. Consequently, many patients are diagnosed with intermediate-to-advanced-stage disease, missing the optimal window for surgical intervention. Currently, three main screening methods for lung cancer exist: imaging examinations, cytology or histology techniques, and biomarkers. Among these, serum biomarkers are crucial in clinical diagnosis and treatment, given their non-invasive and non-radiative nature. Presently recommended biomarkers for lung cancer include carcinoembryonic antigen (CEA), pro-gastrin-releasing peptide (ProGRP), cytokeratin 19 fragment (Cyfra21-1), neuron-specific enolase (NSE), and squamous cell carcinoma antigen (SCC). However, these markers have limited sensitivity and specificity for aiding lung cancer diagnosis. 2,3 Consequently, there is an urgent need to explore novel biomarkers with better diagnostic efficiency that can provide prognostic information and guide immunotherapy, which would substantially help identify individuals who would benefit from early screening and intervention.
Human epididymis protein 4 (HE4), a small secretory protein, is the product of the whey-acidic-protein 4-disulfide bond core domain 2 (WFDC2) gene. WFDC2 belongs to the protease inhibitor family, contributing to protective immunity. 4,5 While HE4 has been extensively studied as a biomarker for ovarian cancer, [6][7][8] Bingle et al. showed in 2006 that it is also expressed in various locations, including the oral cavity, nasopharynx, respiratory tract, salivary glands, and lungs, highlighting its broad tissue distribution. 9 Also, a human genome-wide gene expression microarray showed that 231 differentially expressed genes (DEGs) changed in response to HE4, involving the mitogen-activated protein kinase (MAPK) signaling, extracellular matrix (ECM) receptor, cell cycle, and steroid biosynthesis pathways. 10 Moreover, animal experiments demonstrated that knockout of the WFDC2 gene in mice resulted in severe dyspnea and type I alveolar cell death, suggesting that WFDC2 is involved in lung function. 11 The above results indicate that overexpression of HE4 may play an important role in tumor progression and may be associated with lung cancer. To date, however, research on HE4 in lung cancer has predominantly focused on serological detection, [12][13][14][15] without basic experimental demonstration, and its biological function remains poorly understood. The relationship between HE4 expression and the diagnosis and progression of lung cancer remains unclear.
Therefore, our study first used large public databases to analyze the expression of the WFDC2 gene in lung cancer tissues and then compared HE4 with classical tumor markers at the serological level. Finally, western blot (WB) and immunohistochemistry (IHC) were used to detect the expression characteristics of HE4 in lung cancer cells and tissues. In this way, a more comprehensive evaluation was conducted of the application value of HE4 in the diagnosis and prognosis of lung cancer.

| Biological database analysis
For the analysis, publicly available databases, including The Cancer Genome Atlas (TCGA: http://www.cancer.gov/) and The University of Alabama at Birmingham Cancer Data Analysis Portal (UALCAN: http://ualcan.path.uab.edu/analysis.html), were utilized. The search criteria were restricted to "lung cancer" cases, and efforts were made to exclude duplicate data and eliminate entries lacking clinical information. Under these specified conditions, the expression of the "WFDC2" gene was assessed using the respective functionalities provided by these databases.

| Participant characteristics
This study included 160 participants enrolled at the Nanfang Hospital Zengcheng Campus, Southern Medical University, between January 2020 and August 2023. Three distinct cohorts were established based on diagnostic criteria: the lung cancer cohort (n = 80), the benign lesions cohort (n = 40), and the healthy controls cohort (n = 40). Table 1 provides the basic information for individuals in each group. In addition, cancer tissues and adjacent normal tissues were obtained post-surgery from eight early-stage lung cancer patients, sourced from the Pathology Department at Nanfang Hospital, Southern Medical University.

| Inclusion and exclusion criteria
The inclusion criteria were as follows: (1) for the lung cancer group, diagnosis in accordance with the Chinese Medical Association's Guidelines for Clinical Diagnosis and Treatment of Lung Cancer (2023 version), 16 normal creatinine levels (male < 115 μmol/L, female < 80 μmol/L), and availability of pathological and imaging data for staging and solid tumor response assessment, with solid tumor response categorized according to RECIST 1.1 criteria as complete response (CR), partial response (PR), stable disease (SD), or progressive disease (PD); (2) for the benign group, patients with benign lung lesions diagnosed by clinicians, such as pneumonia, lung mass, and chronic obstructive pulmonary disease; (3) for the healthy control group, individuals from the physical examination population with no underlying disease and no history of medication in the past month. The exclusion criteria encompassed patients with other malignancies or severe underlying diseases, such as severe infections or advanced liver and kidney disorders.

| Serum biomarker measurements
Serum samples were collected using coagulation tubes, separated by centrifugation at 2136 g for 15 min, transferred to new Eppendorf tubes, and stored at −80 °C until analysis. The serum levels of HE4, ProGRP, CEA, NSE, Cyfra21-1, and SCC were determined using the electro-chemiluminescent immunoassay method with the Cobas e602 system from Roche, along with its corresponding kit. The experiments adhered to the instrument operating procedures and laboratory standard operating procedures (SOPs). The accuracy and precision of the test results were maintained through instrument quality control and calibration.

| Western blot
Cells were lysed in a lysis buffer containing a protease inhibitor, a phosphatase inhibitor, and phenylmethylsulfonyl fluoride (PMSF). Following lysis, the cell lysate was centrifuged for 15 min at 12 000 g at 4 °C. The resulting supernatant was used for WB analysis. The protein fraction obtained was boiled and separated via electrophoresis on 12% or 15% sodium dodecyl sulfate (SDS)-polyacrylamide gels. Subsequently, proteins were detected by immunoblotting using the LI-COR Odyssey infrared imaging system (LI-COR Bioscience). Band intensities were quantified using ImageJ software. The antibodies employed in this analysis included HE4 (Santa Cruz, USA, sc-293473), glyceraldehyde 3-phosphate dehydrogenase (GAPDH) (CST, USA, #5174), and ACTIN (Proteintech, China, 66009-1-Ig).

| Immunohistochemistry
IHC was performed on paraffin sections of tumor tissues obtained from lung cancer patients using the standard LSAB protocol (Dako, USA). The primary antibody against HE4 (dilution 1:50) was employed, with species-matched IgG as a negative control. Positive results were indicated by brown staining, and the intensity of the IHC results was quantified using Image Pro Plus 6.0, with average optical density (AOD) used as the measure for analysis.

| Statistical analyses
The chi-square (χ²) test was employed for comparing categorical variables. The distribution of experimental data was assessed using the Shapiro-Wilk test. Two-tailed unpaired Student's t-test or one-way analysis of variance (ANOVA) was utilized for normally distributed data, and the results were presented as mean ± standard deviation (SD). Non-normally distributed data were compared using the Mann-Whitney U test, and the outcomes were described as median (range). The Kruskal-Wallis test was applied to compare three groups, with Bonferroni-corrected significance values. Receiver operating characteristic (ROC) curve analysis was performed to evaluate the predictive abilities of biomarkers, and the cut-off value, corresponding sensitivity, and specificity were determined using the Youden index. All statistical analyses were conducted using SPSS software Version 26.0 (IBM, USA), and statistical significance was set at p < 0.05.
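The cut-off selection described above can be illustrated with a minimal sketch. It is not the SPSS procedure used in this study; it assumes the usual definition of the Youden index, J = sensitivity + specificity − 1, a rule that calls a sample positive when its marker value is at or above the cut-off, and the rank-based (Mann-Whitney) form of the AUC. The function names and the marker values are hypothetical and serve only as an illustration.

```python
from typing import Sequence, Tuple


def youden_cutoff(values: Sequence[float], labels: Sequence[int]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, specificity) maximizing J = sens + spec - 1.

    labels: 1 = disease, 0 = non-disease.
    A sample is called positive when its value is >= cutoff (assumed rule).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required")
    best_j, best = float("-inf"), (0.0, 0.0, 0.0)
    # Every observed value is a candidate cut-off.
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cutoff)
        tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cutoff)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1
        if j > best_j:
            best_j, best = j, (cutoff, sens, spec)
    return best


def auc(values: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (disease, non-disease) pairs ranked correctly; ties count 0.5."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical marker values for illustration only (NOT data from this study).
values = [40, 45, 50, 55, 60, 70, 80, 90, 100, 120]
labels = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
cutoff, sens, spec = youden_cutoff(values, labels)
area = auc(values, labels)
print(cutoff, round(sens, 3), round(spec, 3), round(area, 3))
```

On this toy input the index selects the cut-off that keeps specificity at 100% while losing a single true positive, which shows how the criterion trades the two quantities against each other.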

| The expression of the WFDC2 gene in lung cancer tissues
Analysis of WFDC2 expression in different tumors using the UALCAN database yielded pan-cancer results showing WFDC2 mRNA expression across a variety of tumor tissues (Figure 1A). To further explore the expression pattern specifically in lung cancer, mRNA expression data from TCGA were analyzed. The results demonstrated increased mRNA levels of WFDC2 in both overall lung cancer and lung adenocarcinoma (LUAD) (Figure 1B,C), with no significant difference observed in lung squamous cell carcinoma (LUSC) (Figure 1D).

| Comparison of lung cancer-related tumor marker levels in serum
To verify the above hypothesis, we conducted electrochemiluminescent immunoassays to measure the concentrations of HE4 and five classical lung cancer-related tumor markers, including CEA, ProGRP, Cyfra21-1, NSE, and SCC (Table 2). The results revealed a significant increase in HE4 concentration in lung cancer patients compared with benign lesions and healthy controls (p < 0.001). Additionally, serum levels of HE4, Cyfra21-1, and CEA exhibited a notable increase in stages III-IV compared with stages I-II (p < 0.001).
Based on these results, we suggest that HE4 could serve as a novel and potentially significant biomarker, playing a major role in distinguishing and diagnosing lung cancer.

| Analysis of related factors affecting HE4 levels
To further establish serum HE4 as a suitable biomarker for diagnosing and predicting lung cancer, we analyzed related factors that affect HE4 levels, such as gender, smoking, subtype, stage, and disease progression. Interestingly, there were no significant differences in serum HE4 levels between females and males or between smokers and non-smokers (p = 0.200, p = 0.996; Figure 3A). Moreover, serum HE4 levels were elevated in both non-small cell lung cancer (NSCLC) and small cell lung cancer (SCLC) compared with benign lesions or healthy controls, with no significant difference observed between NSCLC and SCLC (Figure 3B). Additionally, serum HE4 levels in lung cancer patients at stages I-II were significantly higher than those in healthy controls (p < 0.001), suggesting an elevation of HE4 in the early stages of lung cancer. A further substantial increase in HE4 levels was observed in patients at stages III-IV as the disease progressed (Figure 3C). Furthermore, the association between HE4 and disease progression during therapy was evaluated. Higher levels of serum HE4 were associated with disease progression, as patients with PD exhibited significantly higher serum HE4 levels compared with patients in remission or with relatively stable conditions (p < 0.001; Figure 3D).

a Comparison between lung cancer and healthy control (HE4: p < 0.001, ProGRP: p = 0.007, SCC: p = 0.041, NSE: p = 0.001, Cyfra21-1: p < 0.001, CEA: p < 0.001).
b Comparison between lung cancer and benign lesion (HE4: p < 0.001, ProGRP: p = 0.947, SCC: p = 0.023, NSE: p = 0.001, Cyfra21-1: p < 0.001, CEA: p = 0.001).
c Comparison between stages I-II and III-IV (HE4: p < 0.001, ProGRP: p = 0.152, SCC: p = 0.767, NSE: p = 0.020, Cyfra21-1: p < 0.001, CEA: p < 0.001).

| The diagnostic value of serum biomarkers for early lung cancer
To explore the potential of HE4 in diagnosing early stages of lung cancer, a subset of 25 patients at stages I-II was selected. The initial screening positive rates based on serum levels of HE4 were compared with those of other classical biomarkers. In this analysis, the positivity rate of HE4 reached 32%, surpassing that of Cyfra21-1 (28%) and NSE (12%). Notably, ProGRP, SCC, and CEA positivity rates were all less than 10% (8%, 8%, and 4%, respectively) (Table 3).
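Because the early-stage subset contains only 25 patients, each reported positivity rate corresponds to a small whole number of positive patients. The short sketch below converts the percentages quoted above into implied counts; the counts are derived from the reported rates by arithmetic and are not taken from Table 3, and the variable names are illustrative.

```python
# Number of stage I-II patients and positivity rates (%) as reported in the text.
n_early = 25
reported_rates = {"HE4": 32, "Cyfra21-1": 28, "NSE": 12, "ProGRP": 8, "SCC": 8, "CEA": 4}

implied_counts = {}
for marker, rate in reported_rates.items():
    # Integer arithmetic avoids floating-point rounding: count = rate% of n_early.
    count, remainder = divmod(rate * n_early, 100)
    if remainder != 0:
        raise ValueError(f"{marker}: {rate}% of {n_early} is not a whole number of patients")
    implied_counts[marker] = count

for marker, count in implied_counts.items():
    print(f"{marker}: {count}/{n_early} positive")
```

Under this reading, HE4 identifies 8 of 25 early-stage patients against 7 for Cyfra21-1, so the advantage over the next best marker amounts to one patient in this subset.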

| Comparison of the diagnostic performance of multimarker tests in clinical analysis
Recognizing that classical tumor markers may not be ideal diagnostic tests for lung cancer detection, we explored combining classical biomarkers with HE4 to enhance sensitivity and accuracy. In this experiment, ProGRP alone correctly identified NSCLC and SCLC with accuracies of 89.71% and 83.33%, respectively. Interestingly, when ProGRP was combined with HE4, the correct diagnostic rate for NSCLC increased to 91.18%, with little change in the correct rate for SCLC. Furthermore, the combination of ProGRP, HE4, and NSE improved the overall accuracy to 96.25% (Table 4).
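The relation between the subtype-specific rates and an overall accuracy can be checked with a short calculation. The subgroup sizes and correct-call counts below are assumptions inferred from the reported percentages and the lung cancer cohort size of 80 (they are not stated in the text and Table 4 is not reproduced here); they are chosen because they reproduce the quoted figures, including the 88.75% overall accuracy for ProGRP alone cited in the Discussion.

```python
def accuracy(correct: int, total: int) -> float:
    """Percentage of correctly classified cases."""
    return 100.0 * correct / total


# ASSUMED subgroup sizes, inferred from the reported percentages (not stated in the text).
n_nsclc, n_sclc = 68, 12
n_total = n_nsclc + n_sclc  # 80, the size of the lung cancer cohort

# ASSUMED correct-call counts that reproduce the reported rates.
progrp_correct = {"NSCLC": 61, "SCLC": 10}
progrp_he4_nsclc_correct = 62
progrp_he4_nse_correct_total = 77

progrp_nsclc = round(accuracy(progrp_correct["NSCLC"], n_nsclc), 2)
progrp_sclc = round(accuracy(progrp_correct["SCLC"], n_sclc), 2)
# Overall accuracy is the pooled rate, i.e. a size-weighted average of the subtype rates.
progrp_overall = round(accuracy(sum(progrp_correct.values()), n_total), 2)
progrp_he4_nsclc = round(accuracy(progrp_he4_nsclc_correct, n_nsclc), 2)
triple_overall = round(accuracy(progrp_he4_nse_correct_total, n_total), 2)

print(progrp_nsclc, progrp_sclc, progrp_overall, progrp_he4_nsclc, triple_overall)
```

If these inferred counts are right, adding HE4 to ProGRP corresponds to one additional correctly classified NSCLC case, and the three-marker combination to six additional correct calls overall.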

| Expression of HE4 in lung cancer cells and tissues
Given the findings above that HE4 could play an important role in lung cancer, we first investigated whether HE4 is specifically expressed in lung cancer or also in other solid tumors. The protein levels of HE4 were assessed in hepatocellular carcinoma (HCC) cells (HepG2/Huh7/MHCC97H; Figure 4A) and esophageal cancer (EC) cells (KYSE-150/Eca109; Figure 4B). Unexpectedly, HE4 expression showed no statistically significant change in EC cells, whereas it was downregulated in HCC cells. Notably, the expression of HE4 was significantly increased in lung cancer cells (SPC/A549/H520) but decreased in PC9 cells, possibly due to the EGFR exon 19 deletion (E746-A750) (Figure 4C). To further validate these findings, we used IHC analysis and found that the expression levels of HE4 in LUAD and LUSC tissues were higher than those in adjacent tissues (p < 0.05; Figure 4D).

| DISCUSSION
It is well established that the survival time of lung cancer patients is closely linked to the timing of their clinical diagnosis. The 5-year survival rate decreases with disease progression, ranging from 55.5% for stage I to a mere 5.3% for stage IV patients. 17 To improve diagnostic sensitivity, low-dose computed tomography (LDCT) scans are commonly used in lung cancer screening in numerous regions, with satisfactory performance. However, recent studies have indicated that LDCT is associated with a high false-positive rate, risks of overdiagnosis, and cumulative radiation exposure. Consequently, it may not be optimal for lung cancer screening in low-risk populations. 18,19 While there are still limitations in applying serum biomarkers for lung cancer diagnosis, their detection is simple, safe, low cost, and widely applicable in clinical practice. As an emerging tumor marker, HE4 has attracted much research attention since its discovery because it is a member of the WFDC family, homologous to the SLPI and Elafin proteins. 6 Studies have shown that WFDC proteins are widely expressed in the human epididymis and are associated with lung homeostasis and respiratory diseases. 20 Previous studies have reported that HE4 is interconnected with a variety of immune genes and can regulate tumor immune defense by participating in the regulation of B cells in the humoral immune microenvironment and through co-expression with secretory leukocyte protease inhibitor genes. 5 It can also promote angiogenesis and an immunosuppressive tumor microenvironment by regulating signal transducer and activator of transcription 3 (STAT3) target genes. 21 At the same time, related studies have found that the HE4-encoding gene WFDC2 can induce the expression of matrix metalloproteinase 2 (MMP2) by activating the AKT signaling pathway, promote epithelial-mesenchymal transition, and thereby promote the metastasis of ovarian cancer cells. 22 Knocking down WFDC2 can inhibit the proliferation, migration, and invasion of A549 cells. 23 Although high HE4 expression has been shown to affect the occurrence and development of tumors, its biological behavior in lung cancer has rarely been reported, and its use as a lung cancer marker is still in the exploratory stage.
In this study, we first analyzed the TCGA database and determined that WFDC2 was overexpressed in lung cancer tissues, which suggested that WFDC2 may be an oncogene in lung cancer and participate in the regulation of related tumor mechanisms, and that its gene product HE4 may play a role in disease monitoring. Then, we selected 160 samples and compared the expression of HE4 with ProGRP, Cyfra21-1, NSE, SCC, and CEA in the three groups, confirming that HE4 can characterize lung cancer. 15,24 The diagnostic efficiency was highest in LUAD among the subtypes, which suggests a correlation between HE4 expression and cell type. At the same time, we also observed that HE4 was significantly higher in the serum of NSCLC and SCLC patients than in the benign group. However, there was no statistical difference in the expression levels between the two, indicating that HE4 alone did not exhibit specificity towards the two major pathological subtypes, consistent with the findings reported by W. Liu et al. 25 It is essential to note, however, that our cohort study requires the enrollment of more patients for further follow-up analysis. Furthermore, the combined application of HE4 with other biomarkers yielded varied effects. Some studies have reported that ProGRP is highly specific for SCLC. 26 This study showed through retrospective case analysis that ProGRP alone can distinguish NSCLC from SCLC with an accuracy of 88.75%, whereas the combination of ProGRP + HE4 + NSE increased the accuracy to 96.25%, suggesting that the synergistic effect of HE4 combined with other biomarkers enhances the identification of lung cancer subtypes. This finding may provide a further reference for the clinical application of marker combinations. It has been previously suggested that HE4 is a potential marker for early-stage lung cancer. 3,14,27 While the serum levels of HE4 in patients at stages I-II were higher than those in healthy controls, the difference from benign lesions was not evident. However, a retrospective analysis of HE4 and other biomarkers in patients with early lung cancer revealed a positivity rate of 32% for HE4, which still has certain advantages over classical biomarkers. Therefore, to a certain extent, detecting HE4 expression levels may help in the early diagnosis of lung cancer and help identify more patients who could benefit. We also observed a significant increase in HE4 levels in patients with advanced disease (stages III-IV) and speculated that elevated HE4 expression may be an important feature of lung cancer progression. Sun and Mo have reported that lung cancer patients with high HE4 levels have significantly shorter survival times than those with lower levels. 28,29 We conducted a RECIST 1.1 assessment of the treated patients and found that the HE4 levels in the PD group were significantly higher than those of patients in the CR, PR, and SD groups, which supports our conjecture and suggests that HE4 can be used as a prognostic monitoring indicator for lung cancer. 28 Furthermore, the serum levels of HE4 in the lung cancer cohort were not influenced by smoking or gender, in line with the literature. 14,15,29,30 The reason may be that the impact of lung cancer progression on HE4 outweighs the subtle changes associated with gender and smoking.
Based on the pan-cancer analysis results, we selected HCC cells with downregulated expression of WFDC2 and EC cells with upregulated expression for comparative experiments. Using WB, we found that the expression of HE4 was not increased in either type of malignant tumor cell. This finding further emphasizes that HE4 is not a universally expressed tumor protein across all cancer types and is consistent with its specificity as a biomarker. Importantly, HE4 levels were upregulated in lung cancer cells (SPC/A549/H520) but downregulated in PC9 cells. PC9 is a LUAD cell line with an EGFR exon 19 deletion and sensitivity to EGFR-TKI drugs such as osimertinib and ametinib. It has been reported that the expression of HE4 in ovarian cancer is related to epidermal growth factor activity and that HE4 has a potential interaction with epidermal growth factor receptor (EGFR) or other cell surface receptors. 31 Related studies have found that knocking down the HE4 gene in lung cancer can lead to inactivation and downregulation of EGFR expression, thereby inhibiting EGFR downstream signaling pathways and the growth and metastasis of lung cancer cells. 32,33 Moreover, our experiments suggest that, owing to the EGFR gene mutation, PC9 cells downregulate EGFR expression, which in turn affects HE4 expression. In this regard, we speculate that HE4 and EGFR regulate each other at the transcription level or the protein level, but the regulatory mechanism still needs further study. Finally, we further confirmed the overexpression of HE4 in lung cancer tissues through IHC. The results showed that, unlike the expression trend of WFDC2 mRNA in the TCGA database, HE4 was highly expressed in both LUAD and LUSC tissues compared with adjacent tissues. This result is consistent with the increase in HE4 at the serum and cell levels and supports the idea of HE4 as a biomarker for lung cancer. The discrepancy with the mRNA data may reflect the complexity of gene transcription and translation; however, few LUSC pathological samples were enrolled, so there may be a certain degree of bias, and further in-depth research is needed in the future.

| CONCLUSION
Our data suggest that serum HE4 is a better tumor biomarker for diagnosing and predicting lung cancer than other classical markers, particularly in LUAD. Combining HE4 with ProGRP and NSE could further enhance the ability to differentiate NSCLC from SCLC. Although histopathology and imaging remain the gold standard methods for diagnosis, our study suggests that HE4 is a novel potential biomarker for diagnosing and predicting early-stage lung cancer, which may offer a novel avenue for therapeutic targeting and a strategic approach to improving the prognosis of lung cancer patients.

FIGURE 1
The expression of the WFDC2 gene in lung cancer tissues. (A) Expression of WFDC2 in various tumor tissues. (B-D) Expression of WFDC2 in lung cancer, LUAD, and LUSC, respectively. **p < 0.01; ***p < 0.001; ns, no significance.

FIGURE 3
Analysis of related factors affecting HE4 levels, such as gender, smoking, subtype, stage, and disease progression. (A) Comparison of serum HE4 by gender and smoking status. (B) Comparison of serum HE4 in NSCLC and SCLC. (C) Comparison of serum HE4 in stages I-II and III-IV. (D) Comparison of serum HE4 across the different treatment responses of lung cancer patients.

FIGURE 4
Expression of HE4 in hepatocellular carcinoma cells, esophageal cancer cells, lung cancer cells, and tissues. (A) The expression of HE4 was reduced in hepatocellular carcinoma cells. (B) The expression of HE4 in esophageal cancer cells showed no significant difference. (C) The expression of HE4 was upregulated in lung cancer cells (SPC/A549/H520) but downregulated in PC9 cells. (D) The expression of HE4 in lung cancer tissues was higher than that in adjacent tissues. *p < 0.05; **p < 0.01; ***p < 0.001; ns, no significance.

TABLE 1
Basic clinical characteristics of enrolled patients.

TABLE 2
The median value of serum HE4 and classical biomarkers in different groups.